NETWORKING SYSTEMS



The present invention relates to networking systems and the forwarding and routing of information therein,
being more particularly directed to the problems of a common method for managing both cell and packet or frame
switching in the same device, having common hardware, common QoS (Quality of Service) algorithms and common
forwarding algorithms; building a switch that handles frame switching without interfering with cell switching.

Background of Invention 

Two architectures driving networking solutions are cell switching and frame forwarding. Cell switching
involves the transmission of data in fixed size units called cells. This is based on technology referred to as
Asynchronous Transfer Mode (ATM). Frame forwarding transmits data in arbitrary size units referred to either as
frames or packets. Frame forwarding is the basis of a variety of protocols, the most noteworthy being the
Internet Protocol (IP) suite.

The present invention is concerned with forwarding cells and frames in a common system utilizing common
forwarding algorithms. In co-pending U.S. patent application Serial No. 581,467, filed December 29, 1995, for High
Performance Universal Multi-Ported Internally Cached Dynamic Random Access Memory System, Architecture and
Method, and co-pending U.S. patent application Serial No. 900,757, filed July 25, 1997, for System Architecture for
and Method of Dual Path Data Processing and Management of Packets and/or Cells and the Like, both of common
assignee herewith, a promising solution for common cell/frame forwarding is provided.

Most traditional Internet-style host-to-host data communication is carried out in variable size packet format,
interconnected by networks (defined as a collection of switches) using packet switches called routers. Recently,
ATM has become widely available as a technology to move data between hosts, having been developed to provide a
common method for sending traditional telephony data as well as data for computer-to-computer communication.
The previous method employed was to apply Time Division Multiplexing (TDM), in which each circuit is
allocated a fixed amount of time on a channel. For example, circuit A may be allocated x amount of time (and thus
data), followed by y and z and then x again, as later described in connection with the hereinafter discussed
drawings. Under this scheme each circuit is completely synchronous. This method, however, has intrinsic
limitations in bandwidth utilization, since if a circuit has nothing to send, its allocated bandwidth goes
unused. ATM addresses this issue by allowing the circuits to be asynchronous. Though bandwidth is still divided
among fixed length data items, any circuit can transmit at any point in time.

The ITU-T (International Telecommunications Union - Telecommunications, formerly the CCITT), an
organization chartered by the United Nations to provide telecommunications standards, has defined four classes of
service: 1, Constant Bit Rate for Circuit Emulation, i.e. constant-rate voice and video; 2, Variable Bit Rate for
certain voice and video applications; 3, Data for Connection-Oriented Traffic; and 4, Data for Connectionless-Oriented
Traffic. These services, in turn, are supported by certain "classes" of ATM traffic. ATM moves data in fixed size
units called cells, with the classes of traffic supported through ATM Adaptation Layers (AAL). These ATM
adaptation layers are defined in ITU-T Recommendation I.363. There are 3 defined types: AAL1, AAL3/4 and AAL5.
AAL2 has never been defined in the ITU-T recommendations and AAL3 and AAL4 were combined into one type. With
respect to the ATM cell make-up, there is no way to distinguish cells that belong to one layer as opposed to
cells that belong to another layer.

The adaptation layer is determined during circuit setup; i.e. when a host computer communicates to the
network. At this time, the host computer informs the network of the layer it will use for a specific virtual circuit.
AAL1 has been defined to be used for real-time applications such as voice or video; while AAL5 has been defined
for use by traditional datagram oriented services such as forwarding IP datagrams. A series of AAL5 cells is
defined to make up a packet. The definition of an AAL5 packet consists of a stream of cells with the PTI bit set to
0, except for the last one (as later illustrated in Fig. 1). This is referred to as a segmented packet.
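
The segmented-packet rule just described can be illustrated with a minimal sketch. It assumes, purely for illustration, that each cell is represented as a pair of the end-of-packet PTI bit and the cell payload; the type `Cell` and the function `reassemble_aal5` are hypothetical names, and trailer and CRC handling are omitted.

```python
from typing import Iterable, List, Tuple

# Hypothetical cell representation: (end-of-packet PTI bit, cell payload).
Cell = Tuple[int, bytes]

def reassemble_aal5(cells: Iterable[Cell]) -> List[bytes]:
    """Group a stream of cells into packets, closing a packet at each cell whose PTI bit is 1."""
    packets: List[bytes] = []
    current = bytearray()
    for pti_bit, payload in cells:
        current += payload
        if pti_bit == 1:          # last cell of the segmented packet
            packets.append(bytes(current))
            current = bytearray()
    return packets

stream = [(0, b"a" * 48), (0, b"b" * 48), (1, b"c" * 48), (1, b"d" * 48)]
packets = reassemble_aal5(stream)
print([len(p) for p in packets])
```

Cells left over without a closing cell are simply not emitted in this sketch; a real receiver would also apply a reassembly timeout.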

Thus, in current networking technology, data is transported in either variable size packets or fixed size cells
[...] or through ATM networks. If connected directly, then packets are [...]; but if connected by ATM switches,
then all packets exiting the router are [...].
Network architectures based on the Internet Protocol (IP) technology are designed as a "best effort"
service. This means that if bandwidth is available, the data gets through. If, on the other hand, bandwidth is not
available then the data is dropped. This works well with most computer data applications such as file transfers or
remote terminal access. This does not work well with applications that cannot retransmit, or where retransmission is
of no value, such as with video and voice. Getting a video frame out of order makes no sense, whereas file transfer
applications can tolerate such anomalies. Since the packet size is arbitrary at any point in time, making specific delay
variation commitments between any two frames is almost impossible, as there is no way of predicting what type and
size of traffic is ahead of any other type of traffic. The buffers that must handle the data, moreover, must be able to
receive the maximum data size, meaning that the buffering scheme must be optimized to handle larger data packets
while at the same time not wasting too much memory on smaller packets.

ATM is designed to provide several service categories for different applications. These include Constant Bit
Rate (CBR), Available Bit Rate (ABR), Unspecified Bit Rate (UBR) and two versions of Variable Bit Rate (VBR),
real-time and non-real-time. These service categories are defined in terms of Traffic Parameters and QoS Parameters.
Traffic Parameters include Peak Cell Rate (constant bandwidth), Sustainable Cell Rate (SCR), Maximum Burst Size
(MBS), Minimum Cell Rate (MCR) and Cell Delay Variance Tolerance. QoS parameters include Cell Delay
Variation (CDV), Cell Loss Ratio (CLR) and maximum Cell Transfer Delay (maxCTD). As an example, Constant
Bit Rate CBR (e.g. the service used for voice and video applications) is defined as a service category that allows the
user at call setup time to specify the PCR (peak cell rate, essentially the bandwidth), the CDV, maxCTD and CLR.
The network must then ensure that the values requested by the user and accepted by the network are met; if they are
met, the network is said to be supporting CBR.

The various classes of service direct the network to provide better service for some traffic as opposed to
other types of traffic. In ATM, with fixed length cells, switches manage bandwidth utilization on a line effectively by
controlling the amount of data each traffic flow is allowed to put on a line at any moment in time. They generally
have simpler buffer techniques arising from the fact that there is but one size of data unit. Another advantage is
predictable network delays, especially queuing latencies at each switch. Since all data units are the same size, this
helps to ensure that such traffic QoS parameters as CDV are easily measurable in the network. In non-ATM
networks (i.e. frame-based networks), frames can range anywhere from, say, 40 bytes to thousands of bytes,
rendering it difficult to ensure a consistent CDV (or PDV, Packet Delay Variation) since it is impossible to predict
the delays in the network, lacking consistent transfer times of individual packets.

By carving data into smaller units, ATM can increase the ability of the network to decrease the latency of
transmitting data from one host to another. Such also allows for easier queue and buffer management at each hop
through the network. A disadvantage, however, is that a header is added to each cell, making the effective bandwidth
of the network less than if the network had a larger transmission unit. For example, if 1,000 bytes are to be
transferred from one host to another, then a frame-based solution would append a header (approximately 4 bytes)
and transmit the entire frame in less than a second. In ATM, the 1,000 bytes is chopped into 48-byte units, each with
a 5 byte header; i.e. 1,000/48 = 20.833 (or 21 cells). Each cell is then given a 5 byte header, increasing the bytes to be
transmitted by 5 * 21 = 105 extra bytes. Thus ATM effectively decreases the available bandwidth to the actual data
by approximately 100 bytes (or about 10%); the decreasing of end-to-end latency also decreases the available
bandwidth for data transmission.
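
The arithmetic of this example can be checked with a short sketch. The function name `atm_overhead` is an illustrative assumption; the 48 byte payload and 5 byte header are the figures given in the text, and the padding of the final cell is an additional quantity the text does not count.

```python
import math

CELL_PAYLOAD = 48  # payload bytes per ATM cell
CELL_HEADER = 5    # header bytes per ATM cell

def atm_overhead(data_bytes: int):
    """Return (cells needed, total header bytes, padding bytes in the last cell)."""
    cells = math.ceil(data_bytes / CELL_PAYLOAD)
    header_bytes = cells * CELL_HEADER
    padding = cells * CELL_PAYLOAD - data_bytes
    return cells, header_bytes, padding

cells, header_bytes, padding = atm_overhead(1000)
print(cells, header_bytes, padding)   # 21 105 8
print(header_bytes / 1000)            # 0.105, i.e. about 10%
```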

For some applications, such as video and voice, latency is more important than bandwidth, while for other
applications, such as file transfers, better bandwidth utilization increases performance rather than decreased
hop-by-hop latency.

Recently, the demands for more bandwidth and QoS have grown many fold due to new applications for
multimedia services, including the before described video and voice. This is forcing the growth of ATM networks in
the core of traditional packet-based networks. ATM, because of its fixed packet size, brings reduced processing time
in networks and hence faster forwarding (i.e. lower latency). It also brings with it the ability to take advantage of
traffic classification. Since the cells, as earlier pointed out, are of fixed size, traffic patterns can be controlled through
QoS alignments; i.e. networks can carry traditional packets (in cell format) and constant bandwidth stream data
(e.g. voice/video based data).

As will subsequently be demonstrated, most conventional networking systems inherently are designed for
either forwarding frames or cells but not both. In accordance with the present invention, on the other hand, through
use of novel search algorithms, QoS management and management of packet/cell architecture, both cells and frames
can be transmitted in the same device and with significant advantage over the prior techniques, as later more fully
explained.
Objects of Invention

An object of the present invention, accordingly, is to provide a novel system architecture and method, 
useful with any technique for processing data packets and/or cells simultaneously with data packets, and without 
impacting the performance aspects of cell forwarding characteristics. 

A further object is to provide such a novel architecture in which the architected switch can serve as a packet 
switch in one application and as a cell switch in another application, using the same hardware and software. 

Still a further object is to provide such a system wherein improved results are achieved in managing QoS 
characteristics for both cells and data packets simultaneously based on a common cell/data packets algorithm. 

An additional object is to provide a common parsing algorithm for forwarding cells and data packets using 
common and similar techniques. 

Other and further objects will be explained hereinafter, and are more particularly delineated in the appended
claims.

Summary 

In summary, from one of its important viewpoints, the invention encompasses in a data networking system 
wherein data is received as either ATM cells or arbitrarily-sized multi-protocol frames from a plurality of I/O 
modules any of which can be cell or frame interfaces, a method of processing both ATM cells or such frames in a 
native mode, i.e. not transforming frames to cells, using common algorithms for forwarding based on control 
information contained in the cell or frame and in such a manner as to preserve QoS characteristics necessary for 
correct operation of cell forwarding; processing the packet/cell control information in a forwarding engine with 
common algorithms not dependent on context-sensitive information contained in the cell or packet, and passing 
results including QoS information to an egress queue manager; passing the cell/packet to the egress I/O transmit
facility in such a manner as to provide a minimal cell delay variation (CDV) so as not to impact correct cell
forwarding characteristics; and controlling the transmit facility so as to provide a common bandwidth management 
algorithm for both cell and packets and all without impacting the correct operation of either cells or packets. 
Drawings

The invention will now be described in connection with the accompanying drawings in which the
before-mentioned Fig. 1 is a diagram illustrating an ATM (Asynchronous Transfer Mode) cell format;

Fig. 2 is a similar diagram of an Internet Protocol (IP) frame format for 32 bit words;

Fig. 3 is a flowchart comparing Time-Division Multiplexing (TDM), ATM and Packet Data frame
forwarding;

Fig. 4 is a block diagram of the switch of the invention with the cell and packet interfaces; 

Fig. 5 is a block diagram of a traditional prior art bus-based switching architecture, and Fig. 6, its
memory-based switch data flow diagram;

Fig. 7 is a block diagram of a traditional prior art cross-bar type switching architecture, and Fig. 8, its cross- 
bar data flow diagram; 

Figs. 9-11 are interface diagrams illustrating, respectively, a cell switch with a native interface card, a packet
interface on cell switch, and an AAL5 packet interface on cell switch, all with a cross-bar or memory switch;

Figs. 12 and 13 are similar diagrams of a packet switch with native packet interface cards and with AAL5
interface, respectively, for NxN memory connection buses;

Fig. 14 is a block diagram of the switch architecture of the present invention, using the word "NeoN" in 
connection with the packet and cell data switch as a trade name of NeoNET LLC, the assignee of the present 
application; 

Figs. 15 and 16 are diagrams respectively of extended parsing function flows for forwarding decisions and
an overview of such functions and Fig. 17 is a diagram of the forwarding elements;

Fig. 18 is a first stage parse graph tree lookup block diagram, and Fig. 19 is a second stage forwarding table
lookup (FLT) diagram;

Figs. 20 and 21 are respective diagrams of parse graph memory on power up and of a simple illustrative IP
multicast packet;

Fig. 22 presents an initialized lookup table, with all entries pointing to unknown route/cell forwarding
information, and Fig. 23 illustrates the lookup table after adding an illustrative IP address (209.6.34.224/32); and

Fig. 24 is a queuing diagram for scheduling system operation.

Further Background To Preferred Embodiments of Invention

Before proceeding to illustrate the preferred architecture of the invention, it is believed necessary to review
the limitations of the prior and of current network systems, which the present invention admirably overcomes.

Current networking solutions are designed either for switching data packets or cells. As before stated, all
types of data networking switches must receive data on an ingress port, make a forwarding decision, transfer data
from the ingress port to the egress port and transmit that data on the appropriate egress port physical interface.
Beyond the basic data forwarding aspects, there are different requirements for cell switching versus frame
forwarding. As before stated, all current technology divides switching elements into three types: bridges, routers and
switches, and in particular ATM switches. The distinction between bridges and routers is blurred in that both
forward datagrams and typically most routers also do bridging functions as well; thus the discussion focuses on
datagram switches (i.e. routers) and ATM switches.

It is in order first to investigate the basic architectural requirements for these two types of switching devices
based on current solutions, and then to present the reasons why current solutions do not provide mechanisms to allow
simultaneous transfer of cells and frames without severely impacting the correct operations of either ATM switching
or frame forwarding. The novel solution based on the present invention will then be clear.

Routers typically have a wide variety of physical interfaces: LAN interfaces, such as Ethernet, Token Ring
and FDDI, and wide-area interfaces, such as Frame Relay, X.25, T1 and ATM. A router has methods for receiving
frames from these various interfaces, and each interface has different frame characteristics. For example, an Ethernet
frame may be anywhere from 64 bytes to 1500 bytes, and an FDDI frame can be anywhere from 64 bytes to
4500 (including header and trailer) bytes. The router's I/O module strips the header that is associated with the
physical interface and presents the resulting frame, such as an IP datagram, to the forwarding engine. The forwarding
engine looks at the IP destination address, Fig. 2, and makes an appropriate forwarding decision. The result of a
forwarding decision is to send a datagram to the egress port as determined by the forwarding tables. The egress port
then attaches the appropriate network-dependent header and transmits the frame out the physical interface. Since
different interfaces may have different frame size requirements, a router may be required to "fragment" a frame, i.e.
"chop" the datagram into useable size. For example, a 2000 byte FDDI frame must be fragmented into frames of
1500 bytes or less before being sent out on an Ethernet interface.
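
The fragmentation step can be sketched as follows. This is a simplified illustration only: the function name `fragment` is an assumption, and the sketch ignores per-fragment headers and the offset alignment rules that real IP fragmentation imposes.

```python
from typing import List

def fragment(payload: bytes, max_size: int) -> List[bytes]:
    """Chop a payload into consecutive pieces of at most max_size bytes."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    return [payload[i:i + max_size] for i in range(0, len(payload), max_size)]

# A 2000 byte frame leaving on an interface limited to 1500 bytes.
pieces = fragment(bytes(2000), 1500)
print([len(p) for p in pieces])   # [1500, 500]
```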

Current router technology offers "best effort" service. This means that there are no guarantees that 
datagrams will not be dropped in a router-based network. Furthermore, because routers transfer datagrams of varying 
sizes, there are no per datagram delay variation or latency guarantees. Typically a router is characterized by its 
ability to transfer datagrams of a certain size. Thus, the capacity of a router may be characterized by its ability to 
transfer 64 byte frames in one second or the latency to transfer a 1500 byte frame from an ingress port to an egress
port. This latency is characterized by last bit in, first bit out.

An ATM switch, by comparison, has only one type of interface, i.e. ATM. An ATM switch makes a
forwarding decision by looking at a forwarding table based on VPI/VCI numbers, Fig. 1. The forwarding table is
typically indexed by physical port number, i.e. an incoming cell with a VPI/VCI on ingress port N gets mapped to an
egress port M with a new VPI/VCI pair. The table is managed by software elsewhere in the system. All cells, no
matter what the ATM Adaptation Layer (AALx), have the same structure, so that if ATM switches can forward one
AAL type, they can forward any type.
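
A minimal sketch of such a table follows, assuming a plain dictionary keyed by ingress port, VPI and VCI. The class `VcTable` and its methods are hypothetical names for illustration; hardware tables are typically organized quite differently.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[int, int, int]     # (ingress port, VPI, VCI)
Entry = Tuple[int, int, int]   # (egress port, new VPI, new VCI)

class VcTable:
    """Illustrative VPI/VCI forwarding table; entries are installed by control software."""

    def __init__(self) -> None:
        self._table: Dict[Key, Entry] = {}

    def add(self, in_port: int, vpi: int, vci: int,
            out_port: int, new_vpi: int, new_vci: int) -> None:
        self._table[(in_port, vpi, vci)] = (out_port, new_vpi, new_vci)

    def forward(self, in_port: int, vpi: int, vci: int) -> Optional[Entry]:
        # None means no circuit is set up for this cell, so it would be discarded.
        return self._table.get((in_port, vpi, vci))

table = VcTable()
table.add(in_port=1, vpi=0, vci=100, out_port=4, new_vpi=2, new_vci=37)
print(table.forward(1, 0, 100))   # (4, 2, 37)
print(table.forward(1, 0, 101))   # None
```

The lookup never inspects the payload or the adaptation layer, which matches the observation that a switch able to forward one AAL type can forward any type.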

In order to switch ATM cells, several fundamental criteria must be met. The switch must be able to make 
forwarding decisions based on control information provided in the ATM header, specifically VPI/VCI. The switch 
must provide appropriate QoS functions. The switch must provide for specific service types, in particular Constant 
Bit Rate (CBR) traffic and Variable Bit Rate (VBR). CBR (voice or video) traffic is characterized by low latency 
and more importantly low or guaranteed Cell Delay Variation (CDV) and guaranteed bandwidth. 

The three main requirements of implementing CBR type connections over a traditional packet switch are
low CDV, small Delay and guaranteed bandwidth. Voice, for example, consumes a fixed amount of bandwidth,
based on the fundamental Nyquist's sampling Theorem. CDV is also part of a CBR contract, and plays a role in the
overall Delay. CDV is the total worst case variance in expected arrival time and actual arrival time of a packet/cell.
In so far as an application is concerned, it wants to see data arrive equidistant in time. If, however, the network
cannot guarantee this equidistant requirement, some hardware has to buffer data, equal to or more than the worst case
CDV amount introduced by the network. The higher the CDV, the higher is the buffer requirement and hence the
higher Delay; and, as illustrated earlier, Delay is not good for CBR type circuits.
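
As a rough illustration of this definition, the sketch below measures the worst-case deviation between expected and actual arrival times. It assumes the expected arrivals are evenly spaced from the first arrival; the function name `worst_case_cdv` and the sample timestamps are hypothetical.

```python
from typing import Sequence

def worst_case_cdv(arrivals: Sequence[int], period: int) -> int:
    """Largest gap between each actual arrival and its evenly spaced expected time."""
    start = arrivals[0]
    return max(abs(t - (start + i * period)) for i, t in enumerate(arrivals))

# Arrival times in microseconds for cells expected every 10 microseconds.
arrivals_us = [0, 10, 23, 30, 38]
cdv_us = worst_case_cdv(arrivals_us, period=10)
print(cdv_us)   # 3
```

A receiver that wants to play the data out at equidistant intervals would need to buffer at least this much time's worth of traffic, which is the added Delay the text refers to.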

Packet-based networks traditionally queue data at the egress based on priority of traffic. Regardless of how
data is queued, traffic with low delay variation requirements can be queued behind one or more packets. Each of
them could be of maximum packet size, and this inherently contributes the most to delay variation on a packet-based
network.



There are many methodologies used to manage bandwidth and priorities. From a Network Management
point of view, a network manager usually likes to carve out the total egress bandwidth into priorities. There are
several reasons for carving this bandwidth: e.g., it ensures the manager that [...] Bandwidth) always has room on the
wire even during very high line bandwidth utilization, or perhaps a CBR (Constant Bit Rate) traffic will be
guaranteed on the wire, etc.

There are numerous methods to address bandwidth per traffic priority. Broad classes of these mechanisms
are Round Robin Queuing, Weighted Fair Queuing and Priority Queuing. Each methodology will be explained for
the sake of discussion and completeness of this document. In all cases of queuing, traffic is put into queues based on
priorities, usually by a hardware engine that looks at a cell/packet header or control information associated with the
cell/packet as the cell/packet arrives from the backbone. It is how data is extracted/dequeued from these queues that
differentiates one queuing mechanism from another.

Simple Round Robin Queuing

This queuing mechanism empties all queues in a round robin fashion. This means that traffic is divided into
queues and each queue gets the same fixed bandwidth. While a clear advantage is simplicity of implementation, a
major disadvantage of this queuing technique is that this mechanism completely loses the concept of priority. Priority
must then be managed by buffer allocation mechanisms.
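
A minimal sketch of round robin dequeuing, assuming each queue is a simple in-memory FIFO; the function name `round_robin` and the item labels are illustrative only.

```python
from collections import deque
from typing import Deque, List

def round_robin(queues: List[Deque[str]]) -> List[str]:
    """Drain all queues, taking one item from each non-empty queue per pass."""
    sent: List[str] = []
    while any(queues):            # a deque is truthy while it still holds items
        for q in queues:
            if q:
                sent.append(q.popleft())
    return sent

queues = [deque(["a1", "a2", "a3"]), deque(["b1"]), deque(["c1", "c2"])]
order = round_robin(queues)
print(order)   # ['a1', 'b1', 'c1', 'a2', 'c2', 'a3']
```

Every queue is visited equally often regardless of what it carries, which is why priority is lost in this scheme.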

Weighted Round Robin

This queuing mechanism is an enhancement of Simple Round Robin Queuing, where a weight is placed
on each queue by the network manager during initialization time. In this mechanism, each priority queue is serviced
based on the weight. If one queue is allocated 10% of the bandwidth, it will be serviced 10% of the time. Another
queue may have 50% of the allocated bandwidth, and will be serviced 50% of the time. The major drawback is there
is unused bandwidth on the wire when there is no traffic in a queue of the allocated bandwidth. This results in wasted
bandwidth. There is, moreover, no association of packet size in the dequeuing algorithm, which is crucial for
packet-based switches. Giving equal weight to all packet sizes throws off the bandwidth allocation scheme.
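
The wasted-bandwidth drawback can be made concrete with a small sketch. It assumes integer weights interpreted as service slots per round, and counts a slot as idle when its queue is empty; `weighted_round_robin` is a hypothetical name, and packet size is deliberately ignored, as in the mechanism described.

```python
from collections import deque
from typing import Deque, List, Tuple

def weighted_round_robin(queues: List[Deque[str]], weights: List[int],
                         rounds: int) -> Tuple[List[str], int]:
    """Give each queue `weight` slots per round; a slot is wasted if its queue is empty."""
    sent: List[str] = []
    idle_slots = 0
    for _round in range(rounds):
        for q, weight in zip(queues, weights):
            for _slot in range(weight):
                if q:
                    sent.append(q.popleft())
                else:
                    idle_slots += 1   # reserved bandwidth that nobody else may use
    return sent, idle_slots

high = deque(["h1", "h2", "h3", "h4", "h5"])
low = deque(["l1"])
sent, idle_slots = weighted_round_robin([high, low], weights=[3, 1], rounds=2)
print(sent, idle_slots)   # ['h1', 'h2', 'h3', 'l1', 'h4', 'h5'] 2
```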

Priority Queuing

In this queuing mechanism, output queues are serviced purely based on priority. The Highest Priority
Queue gets serviced first and the Lowest Priority Queue gets serviced last. In this mechanism, Higher Priority
Traffic always preempts the Lower Priority Queue. The drawback of this type of mechanism is that the Lower
Priority Queue may end up with zero bandwidth. The advantage of this mechanism, besides being simple, is that
bandwidth is not wasted: so long as there is data to send, it will be sent. There is, however, no association of packet
size in the dequeuing algorithm, which is crucial for packet-based switches. Giving equal weight to all packet sizes
throws off the bandwidth allocation scheme, as before noted.
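
A minimal sketch of strict priority dequeuing, assuming the queues are listed from highest to lowest priority and each transmit opportunity is one slot; `priority_dequeue` is an illustrative name.

```python
from collections import deque
from typing import Deque, List

def priority_dequeue(queues: List[Deque[str]], slots: int) -> List[str]:
    """For each slot, send from the highest-priority queue that has data."""
    sent: List[str] = []
    for _slot in range(slots):
        for q in queues:          # queues[0] is the highest priority
            if q:
                sent.append(q.popleft())
                break
    return sent

high = deque(["h1", "h2", "h3"])
low = deque(["l1"])
sent = priority_dequeue([high, low], slots=3)
print(sent, list(low))   # ['h1', 'h2', 'h3'] ['l1']
```

No slot is wasted while any queue holds data, but the low priority item is still waiting after three slots, illustrating how the lower priority queue can receive zero bandwidth.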

From the above examples, there is a need to strike a balance between Priority Queuing and Weighted
Round Robin Queuing, along with packet [...] allocation. In addition to the above requirement, the output buffer
should be filled with data from a queue, when the bandwidth of that queue is [...], including with other bandwidth
eligible queue data. This technique enforces the bandwidth per traffic queue requirement and also does not waste
bandwidth on the wire, and is embodied in the invention.

Architectural Issues in Switch Design

Current switching solutions employ two distinct solutions: 1) memory and 2) cross-bar. These solutions are
illustrated in Figs. 5 and 6, showing a traditional bus-based and memory-based architecture, and in Fig. 7, showing a
traditional cross-bar switching architecture.

In the traditional memory-based solutions represented by Fig. 5, data must first be placed inside of main
memory. Since several different I/O modules must place data in common memory, contention for this resource
occurs. Main memory provides both a buffering mechanism and a transfer mechanism for data from one physical
port to another. [...] ability of the system to move data in and out of main memory and the number of interfaces that
must access main memory.

As more fully shown in Fig. 6, the CPU interfaces through a common bus, with memory access, with a
plurality of data-receiving and transmitting I/O ports #1, #2, etc., with the various dotted and dashed lines showing
the interfacing paths and the shared memory, as is well known. As pointed out previously, the various accesses of
the shared memory result in substantial contention, increasing the latency and unpredictability, which is already
substantial in this kind of architecture because the processing of the control information cannot begin until the entire
packet/cell is received.

Furthermore, as the accesses to the shared memory are increased, so does the contention; and as the
contention is increased, this results in increasing the latency of the system. In the traditional memory-based switch
data flow diagram of Fig. 6, thus, where the access time per read or write to the memory is equal to M, and the
number of bits for a memory access is W, the following functions occur:

There is the write of data from the receive port #1 to shared memory. The time to transfer a packet or cell is
equal to ((B*8)/W)*M, where B is equal to the number of bytes for the packet or cell, M is the access time per read
or write to the memory and W is the number of bits for a memory access. As the packet gets larger, so does the time
to write it to memory.

This means that if a packet is destined to an ATM interface as in Fig. 5, followed by a cell, the cell is
delayed by the amount of transfer time from main memory, and in the worst case this could be N packets (where N is
the number of packet, non-ATM interfaces) including the contention among other reads and writes on the bus. If, for
example, B=4000 bytes and M is 80 nanoseconds (for a 64 bit-wide bus for DRAM access), then ((4000 * 8)/64) *
80 = 40,000 nanoseconds for a packet transfer queued before a cell can be sent, and OC-48 is 170 nanoseconds per
64 byte cell. This is only if there is no contention on the bus whatsoever. In the worst case, if a switch has 16 ports,
and all the ports are contending simultaneously, then to transfer the same packet would require 640,000 nanoseconds
just to get into the memory, and the same amount to get out, a total time of about 1.3 milliseconds. This occurs if
between each write into memory, another port has to write to memory as well. So for n=16 ports, n-1, or 15 ports
have to gain access to memory. This means that 15 ports * 80 nanoseconds = 1200 nanoseconds are used by the
system before the next transfer into memory of the original port can occur. Since there are (4000 bytes * 8
bits/byte)/64 bits = 500 accesses, each access is separated by 1200 nanoseconds, and the full transfer takes 500 *
1200 = 600,000 nanoseconds. So the total is system time plus actual transfer time, which is 600,000 nanoseconds +
40,000 nanoseconds = 640,000 nanoseconds for the transfer into memory, and another 640,000 nanoseconds out of
memory. This calculation, moreover, does not include any CPU contention issues or delay because of egress port
busy, which would make this calculation even larger.
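
The figures in this worked example can be reproduced with a short sketch of the ((B*8)/W)*M formula and the worst-case contention model described above. The function names are illustrative assumptions; the values B=4000, W=64, M=80 and 16 ports are those of the text.

```python
def accesses_needed(num_bytes: int, width_bits: int) -> int:
    """Number of memory accesses, rounded up to a whole access."""
    return -(-(num_bytes * 8) // width_bits)

def transfer_ns(num_bytes: int, width_bits: int, access_ns: int) -> int:
    """Uncontended transfer time: ((B*8)/W)*M."""
    return accesses_needed(num_bytes, width_bits) * access_ns

def contended_transfer_ns(num_bytes: int, width_bits: int,
                          access_ns: int, ports: int) -> int:
    """Worst case: every access waits for the other (ports - 1) ports to access memory."""
    n = accesses_needed(num_bytes, width_bits)
    system_ns = n * (ports - 1) * access_ns   # time consumed by the other ports
    return system_ns + n * access_ns          # plus the actual transfer time

print(accesses_needed(4000, 64))                    # 500
print(transfer_ns(4000, 64, 80))                    # 40000
one_way = contended_transfer_ns(4000, 64, 80, 16)
print(one_way, 2 * one_way)                         # 640000 1280000 (about 1.3 ms)
```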

There are similar disadvantages in traditional cross-bar based solutions as shown in Fig. 7, before
referenced, where there is no main memory, and buffering of data occurs both at the ingress port and egress port. In
the memory-based design of Figs. 5 and 6, buffer memory is shared across all ports, making for very efficient
utilization of memory on the switch. In the cross-bar approach of Fig. 7, each port must provide a large amount of
memory, so that the overall memory of the system is large as there is no common sharing of buffers. The cross-bar
switch is only a conduit for the transfer of data from one physical port on the system to another physical port on the
system. If two ports are simultaneously to transfer data to one output port, one of the two input ports must buffer the
data, thereby increasing the latency and unpredictability as the data from the first input port is transferred to the
output port. The advantage of a cross-bar switch over a memory-based switch, however, is the high rate of data
transfer from one point to another without the inherent limitation of main memory contention on the memory-based
switch.



In the traditional cross-bar switching architecture system of Fig. 7, the CPU interfaces through a common
bus, with memory access, to an interface with the various dotted and dashed lines of Fig. 8 showing the interfacing
paths and the shared memory, as is well known. The CPU makes a forwarding decision based on information in the
data. The data must then be transmitted across the cross-bar switch fabric to the egress port. But if other traffic is
being forwarded to that egress interface, then the data must be buffered in the ingress interface for so long as the
amount of time it takes to transfer the entire cell/packet to the egress memory. There is:

A. Write of data from the receive port #1 to local memory. The time to transfer a packet or cell is equal
to ((B*8)/W)*M, where B is equal to the number of bytes for the packet or cell, M is the access time
per read or write to the memory and W is the number of bits for a memory access. As the packet gets
larger, so does the time to write it to memory.

B. Write of data from the receive port #1 to local memory of egress port #2. The time to transfer a
packet or cell is equal to ((B*8)/W)*M + T, where B is equal to the number of bytes for the packet or
cell, M is the access time per read or write to the memory, W is the number of bits for a memory
access and T is the transfer time of the cross-bar switch. As the packet gets larger, so does the time to
transfer it across the cross-bar switch and write it to local memory.
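
As a hedged illustration of step B, the sketch below evaluates ((B*8)/W)*M + T and expresses the result in OC-48 cell times, using the 170 nanosecond figure quoted earlier. The value chosen for T (100 nanoseconds) is purely a hypothetical placeholder, since the text gives no number for the cross-bar transfer time; `crossbar_transfer_ns` is likewise an illustrative name.

```python
OC48_CELL_NS = 170   # time per 64 byte cell at OC-48, as quoted in the text

def crossbar_transfer_ns(num_bytes: int, width_bits: int,
                         access_ns: int, fabric_ns: int) -> int:
    """Time for step B: ((B*8)/W)*M + T."""
    accesses = -(-(num_bytes * 8) // width_bits)   # ceiling division
    return accesses * access_ns + fabric_ns

# fabric_ns=100 is an assumed value for T, not taken from the source.
delay_ns = crossbar_transfer_ns(4000, 64, 80, fabric_ns=100)
cells_delayed = delay_ns / OC48_CELL_NS
print(delay_ns)               # 40100
print(round(cells_delayed))   # roughly 236 cell times spent waiting behind one packet
```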

For a packet transfer followed by a cell transfer to an egress port, the calculation is the same as for the
memory-based solution of Figs. 5 and 6. The packet must be transferred to local memory at the same speeds as for
the memory-based solution. The advantage that there is no contention for central memory does not alleviate the
problem that a packet transfer in front of a cell transfer can cause delays that prevent the proper functioning of very
fast interface speeds.

The goal is to create a switching device running at high speeds (i.e. SONET defined rates) that provides the
required QoS. The device should be scalable in terms of speed and ports, and the device should allow for equal-time
transfer of cells and frames from an ingress port to an egress port.

While current designs have started to come up with very high speed routers, they have not, however, been 
able to provide all the ATM service requirements, thus still maintaining a polarized set of networking devices, i.e. 
routers and ATM switches. An optimal solution is one that achieves very high speeds, that provides the required 
QoS support and that has interfaces that merge ATM and Packet-based technologies on the same interface, Fig. 3. This 
will allow the current investment in either networking technology to be preserved, yet satisfy bandwidth and QoS 
demands. 



The issues in merging interfaces on a data switch port that accepts ATM cells and treats certain ATM cells 
as packets and others as ATM flow, accepts only packets on other interfaces and only cells on yet another set of 
interfaces, are shown in later-discussed Fig. 4. These issues are threefold: a) forwarding decision at the ingress 
interface for packets and cells, b) switching packets and cells through the switch fabric, and c) managing egress 
bandwidth on packets and cells. The present invention, based on the technique of the previously cited co-pending 
applications, explains how to create a general data switch that merges the two technologies (i.e. ATM switching and 
packet switching) and solves the three issues listed above. 

Interface Issues with Switch Designs 

The purpose of this section is to compare and contrast ATM and Packet-based switch designs and various 
interfaces on either type of switch design. Specifically, it identifies problems with both devices as they pertain to forwarding 
packets or cells; i.e. issues with ATM switches forwarding packets, and issues with Packet switches forwarding cells, Fig. 3. 

Typical Design of an ATM Switch 

As previously explained, defined within the ATM standard there are multiple ATM Adaptation Layers (AAL1-AAL5), 
each one specifying a different type of service from a wide spectrum of services; namely, Constant Bit Rate (CBR) 
to Unspecified Bit Rate (UBR). A Constant Bit Rate (AAL1) contract guarantees minimal cell loss with low CDV, while an 
Unspecified Bit Rate contract specifies no traffic parameters and no Quality of Service guarantees. For the purposes of this 
invention it is convenient to limit the discussion to AAL1 (CBR) and AAL5 (Fragmented Packets). 

Fig. 9 illustrates cell switching with native cell interface cards, showing different modules of a generic ATM 
Switch with native ATM interfaces. The cells arriving from the physical layer module (PHY) are processed by a module 
called the Policing Function Module, which validates per VCI established contracts (services) for incoming cells; e.g. Peak 
Cell Rate, Sustained Cell Rate, Maximum Burst Rate. Other parameters such as Cell Delay Variation (CDV) and Cell Loss 
Rate (CLR) are guarantees provided by the box based on the actual design of the cards and the switch. The contracts are set 
by the network manager or via ATM signaling mechanisms. Cell data from the policing function then goes, in the 
example of Fig. 9, to a Cross Bar-type (Fig. 7) or Memory-based Switch (Fig. 5). Cells are then forwarded to the egress port, 
which has some requirements of shaping traffic to avoid congestion on the remote connection. To provide egress shaping, 
the design will have to buffer data on the egress side. Since ATM connections are based on a point-to-point basis, the 
Egress shaper module also has to translate the ATM Header. This is because the next hop has no relationship to the ingress 
VCI/VPI. 



Native Packet Interface on ATM Switch 

As mentioned in the "Background" section, if an ATM switch is to provide a method that facilitates the routing of 
packets, there have to be at least two points between two hosts where packet and cell networks meet. This means that 
current cell switching equipment has to carry interfaces that have native packet interfaces, unless the switch is sitting deep 
in the core of the ATM network. It is now in order to [...] 
to the ATM switch. 

A typical Packet interface on an ATM Switch is shown in Fig. 10, elaborating on the packet interface on the cell 
switch. The physical interface would put incoming packets into a buffer and then they are fed to the "Header Lookup and 
Forwarding Engine". The packet-based forwarding engine decides the egress port and associates a VCI number for cells of 
that packet. The packet then gets segmented into cells by the Segmentation Unit. From there on, the packet is treated just as 
in the native Cell Switching case, which involves going through a policing function and to the Switch Buffer before entering 
the switch. On the egress side, if the cells enter a cell interface, then the processing is just as explained above (in the native 
cell interface on ATM switch). If the cells enter a packet interface, then the cells have to be reassembled into packets. These 
packets are then put into various priority queues and then emptied as in the packet switch. 
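
The segmentation and reassembly steps described above can be sketched as follows. This is a simplified illustration, not the AAL5 procedure itself: a real AAL5 frame carries its length and a CRC in a trailer, whereas here the original length is simply passed to the reassembler. The 48-byte payload of a 53-byte ATM cell is standard; the tuple layout and the function names are assumptions.

```python
CELL_PAYLOAD = 48  # payload bytes in a 53-byte ATM cell (5-byte header)


def segment(packet: bytes, vci: int):
    """Split a packet into (vci, is_last, payload) cells, zero-padding the last cell."""
    cells = []
    for offset in range(0, len(packet), CELL_PAYLOAD):
        chunk = packet[offset:offset + CELL_PAYLOAD]
        is_last = offset + CELL_PAYLOAD >= len(packet)
        cells.append((vci, is_last, chunk.ljust(CELL_PAYLOAD, b"\x00")))
    return cells


def reassemble(cells, original_length: int) -> bytes:
    """Concatenate the payloads of one VCI in order and strip the padding.

    The whole packet must be buffered before it can be handed on, which is
    the source of the reassembly delay discussed in the text.
    """
    data = b"".join(payload for _, _, payload in cells)
    return data[:original_length]


if __name__ == "__main__":
    packet = bytes(range(100))
    cells = segment(packet, vci=42)
    print(len(cells), "cells; round trip ok:", reassemble(cells, len(packet)) == packet)
```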

Two types of packet interfaces on the ATM Switch should be examined. 
AAL5 Interface on ATM Switch 

A Router connected to an ATM Switch could segment packets before sending the packet to the ATM Switch. In that 
case, packets would arrive at the ATM Switch in AAL5 format, before described. If the ATM Switch were to act as a 
Router and an ATM Switch, it would have to reassemble the AAL5 Packet and perform a routing decision on it. Once the 
ATM Switch/Router makes the forwarding decision on the AAL5 packet, it would then push it through the ATM Switch 
after segmenting it again. 

An AAL5 packet interface on an ATM Switch is shown in Fig. 11. Incoming AAL5 cells are first policed on a per 
VCI basis to ensure that the sender is honoring the contract. Once the policing function is done, an Assembler will 
assemble the cells of a VCI into packets. These packets are then forwarded to the forwarding engine, which makes the 
forwarding decision on the assembled packet and some routing algorithm. The packet then travels the ATM Switch as 
mentioned in the Packet Interface on ATM Switch section, above. 

Difficulties in Processing Packets on Cell Switch 

Keeping the goal of the present invention in mind, i.e. to achieve strict QoS parameters such as CDV and latency 
and packet loss, this section will list the difficulties of attempting to design for packets through a traditional cell switch. 

According to Fig. 11, once the incoming AAL5 segmented packets are assembled and a forwarding decision is 
made, they are segmented in the "Segmentation Unit". Across the Switch, the AAL5 cells are then reassembled into 
packets before they are shipped on the egress wire. This segmentation and reassembly adds to the delay and to unpredictable 
and unmeasurable PDV (Packet Delay Variation) and cell loss. As earlier mentioned, for packets to be provided QoS, the 
switch would need to support contracts that include providing measurable PDV and delay. Delay is caused by the fact that the cells 
have to be reassembled. Each reassembly would have to, in the best case, buffer an entire packet worth of data before calling it 
complete and sending it to the QoS section. For an 8000 byte packet, for example, this could result in 64 usec delay in 
buffering on a 1 Gigabit switch. 

The PDV for a packet through a cell switch is even more of a concern than the additional delay. The assembly 
process can be processing multiple packets at the same time from various ingress ports and packets, and this causes an 
unpredictable amount of PDV, essentially based on switch contention and the number of retries of sending cells from 
ingress to egress. 
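
The 64 usec buffering figure quoted earlier in this section for an 8000 byte packet follows directly from the packet size and the link rate. A minimal sketch of that arithmetic is given below; the function name is an assumption, and the calculation ignores cell header overhead and padding.

```python
def reassembly_buffer_delay_us(packet_bytes: int, link_bits_per_second: int) -> float:
    """Best-case time, in microseconds, to buffer a whole packet arriving at line rate."""
    return packet_bytes * 8 * 1_000_000 / link_bits_per_second


if __name__ == "__main__":
    for size in (64, 1500, 8000):
        delay = reassembly_buffer_delay_us(size, 1_000_000_000)
        print(f"{size:5d} bytes at 1 Gbit/s: {delay:.3f} usec")
```

Because the delay scales with packet size, a mix of packet sizes on one reassembly path produces a correspondingly wide spread of delays.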

Cell loss through the switch causes packets to get reassembled incorrectly and therefore adversely affects 
applications that are real-time content specific. Most file transfer protocols do recover from a dropped packet (due to 
dropped cells), but it causes more traffic on the switch due to retransmissions. 

In summary, passing packets through an ATM switch does not provide packets with the same CDV and latency 
characteristics as cells. It simply provides a mechanism for passing a packet path through a cell switch. 

Design of Packet Switch 

A traditional Packet Switch is shown in Fig. 12 with native packet interface cards. Packets are forwarded to the 
Forwarding Engine via the physical interface. The Forwarding Engine makes a routing decision based on some algorithm 
and the header of the packet. Once the egress port is decided, the packet travels to the egress via the Packet Switch, which 
could be designed in one of many ways (e.g. N by N busses, large central memory pool, etc.). On egress, the packets end up 
on different traffic priority Queues. These Queues are responsible for prioritizing traffic and bandwidth management. 

Cell Interface on Packet Switch 

The traditional packet switch, shown in Fig. 13 with AAL5 interfaces, provides a mechanism to allow cells to pass 
through the box so long as the cells are of AAL5 type. There is no practical way of creating a virtual cell switch through a 
traditional packet switch, and part of the present invention deals with the requirements of such an architecture. 

After AAL5 cells are policed for contract agreements, they are assembled into packets by an Assembly module. 
The packets thus created are then processed exactly like native packet interfaces. On the egress side, if packets have to go 
out of the Switch as AAL5 cells, they are first segmented and then header translated. Finally they are shaped and sent out. 
Difficulties in Processing Cells on Packet Switch: 

There are problems that a cell flow faces as it traverses a traditional packet switch. It is extremely difficult for a 
traditional data switch, such as a router, to support the QoS guarantees required of ATM. To illustrate the point, reference 
is made to the diagram shown in before-described Fig. 13. One of the biggest challenges for a packet switch is to support 
AAL1 cells. The simple reason is that the traditional Packet-based header Lookup and Forwarding engines do not 
simultaneously recognize cells and packets; therefore, only AAL5 cells which can be converted into packets are supported. This is 
a severe restriction in the capability of the switch. 

Among the features of cells are the CDV and the delay characteristics. Pushing cells through a traditional packet 
switch adds more delay and an unpredictable CDV. The packet switch, as is inherent in its name, implies that packets of 
various sizes and numbers are queued up on the switch. Packetized cells would then have no chance of maintaining any type 
of reasonable QoS through the switch. 

Preferred Embodiments of the Invention 

The present invention, exemplarily illustrated in Figs. 4 and 14, and unlike all these prior systems, 
optimizes the networking system for transmitting both cells and frames without internally converting one into the 
other. Furthermore, it maintains the strict QoS parameters expected in ATM switches, such as strict CDV, latency 
and cell loss. This is achieved by having a common ingress forwarding engine that is context independent, a switch 
fabric that transfers cells and frames with similar latency, and a common egress QoS engine; packets flowing 
through the architecture of the invention acquire cell QoS characteristics while the cells still maintain their QoS 
characteristics. 

The main components of the novel switch architecture of the invention, sometimes referred to herein by the 
acronym for the assignee herein, "NeoN," as shown in Fig. 14, comprise the ingress part, the switch fabric and the 
egress part. The ingress part is comprised of differing physical interfaces that may be cell or frame. A cell interface 
furthermore may be either pure cell forwarding or a mixture of cell and frame forwarding where a frame is comprised 
of a collection of cells as defined in AAL5. Another part of the ingress component is the forwarding engine, which is 
common to both cells and frames. The switch fabric is common to both cells and frames. The egress QoS is also 
common to both cells and frames. The final part of egress processing is the physical layer processing, which is 
dependent on the type of interface. Thus, the NeoN switch architecture of the invention describes those parts that are 
common to both cell and frame processing. 

The key parameters required for ATM switching, as earlier explained, and that are provided even in the 
case of simultaneous packet switching, are predictable CDV, low Latency, low Cell Loss and bandwidth 
management; i.e. providing a guaranteed Peak Cell Rate (PCR). The architecture of the invention, Figs. 4 and 14, 
however, contains two physical interfaces, AAL5/1 and packet interface, at the ingress and egress. The difference 
between the two types of interface is the modules listed as "Per VC Policing Function" and "Per VC Shaping". For 
cell interfaces (AAL1-5), the system has to honor contracts set by the network manager as per any ATM switch and 
also provide some sort of shaping on a per VCI basis at the egress. Besides those physical interface modules, the 
system is identical for a packet or a cell interface. The system is designed with the concept that once the data 
traverses the physical interface module, there should be no distinction between a packet and a cell. Fig. 14 lists the 
core of the architecture, which has three major blocks, namely, "Header Lookup and Forwarding Engine", "QoS", 
and "Switch" fabric, that handle cells and packets indiscriminately. The discussion, as it relates to this invention, lies 
in the design of these three modules, which will now be discussed in detail. 

Switch Fabric 

The inventions presented in before-referenced co-pending U.S. patent applications Serial No. 581,467 and 
Serial No. 900,757, both of common assignee herewith, optimize the networking system for minimal latency, and can 
indeed achieve zero latency even as data rates and port densities are increased. They achieve this equally well, 
moreover, for either 53 byte cells or 64 byte to 64 K byte packets through extracting the control information from 
the packet/cell as it is being written into memory, and providing the control information to a forwarding engine 
which will make switching, routing and/or filtering decisions as the data is being written into memory. 

Native Cells through the Switch 

The switch cells (AAL1/5) of Fig. 14 are first policed at 2 as per the contract the network manager has 
installed on a per VCI basis. This module could also assemble AAL5 cells into packets on selected VCI. Coming out 
of the policing function 2 are either cells or assembled packets. Beyond this juncture of the data flow, there is no 
distinction between a packet or a cell until the data reaches the egress port, where data has to comply with the 
interface requirements. The cells are queued up in the "NeoN Data Switch" 4 and the cell header is examined for 
destination interface and QoS requirements. This information is passed on to the egress interface QoS module 6 via a 
Control Data Switch, so-labeled at 8. The QoS for a cell-type interface will simply ensure that cell rates beyond the 
Peak Cell Rate are clipped. The cells are then forwarded to the "Per VCI Shaping" module 10, where the cells are 
forwarded to the physical interface after they are shaped as per the requirements of the next hop switch. Since the 
QoS module 6 does not know from the control data whether a packet or a cell [...] 
header translation if it were a cell going into another VC tunnel, and/or segmentation [...] 
out on a cell interface, and/or perform shaping as per the remote end requirements. 
Native Packets through the NeoN Switch 

[...] Engine module while the data is sent to the NeoN data switch 4. The Ingress Forwarding Engine [...] 
the QoS and the destination interface [...]. The Forwarding Engine also gathers all information regarding 
the data packet, such as NeoN Switch address, Packet QoS and Egress Header translation information, and 
sends it across to the egress interface card. This information is carried as a control packet through the 
control data switch 8 to the Egress QoS module 6 [...]. If the packet is to egress on a cell interface, the 
packet will be segmented, then header translated and shaped before leaving the interface. 



Advantages of the NeoN Switch Architecture of the Invention 

As seen above, cells and packets flow through the box without any distinction [...], such that if cell 
characteristics are maintained, then packets have the same characteristics as the cells. The packets thus 
have measurable and low PDV (Packet Delay Variation) and low latency [...] packet switching with cell 
characteristics while interfacing to existing cell [...]. [...] through a traditional ATM Switch also suffer the 
same long delays and unpredictable [...]. [...] possible include the Ingress Forwarding Engine, the Egress QoS, 
and the Switch Fabric. 

Ingress Forwarding Engine Description 

The purpose of the Ingress Forwarding Engine [...] is to [...] 
predefined criteria and contents of the frame/cell, make [...] 
compared against items stored in memory. If a match is defined, then the contents of the memory location 
provide commands for actions on the cell/frame in question. The termination of the search, which is an [...] 
process, results in a forwarding decision. A forwarding decision is a determination of how to process the 
aforementioned frame/cell. Such processing may include counting statistics, dropping the frame or cell, or sending 
[...] of four characters is shown [...] 
input producing a pointer to the next character. The final character b produces a pointer to a forwarding entry. A 
different stream of characters than that illustrated would have [...] 
different results. 

The proposed Ingress Forwarding Engine is defined to be a Parsing Micro-Engine. The Parsing Micro-Engine [...] 
logic that follows instructions written into the passive memory component, which is composed of two major storage 
sections: 1) Parse Graph Tree (PGT), Fig. 18, and 2) Forwarding Lookup Table (FLT), Fig. 19, and a minor storage 
section for statistics collection. The Parse Graph Tree is a storage area that contains all the packet header parsing 
information, the result of which is an offset in the Forwarding Lookup Table. The FLT contains information about the 
destination port, multicast information and egress header manipulation. The design is very flexible, e.g. in a datagram, it 
can traverse beyond the DA and SA fields in the packet header and search into the Protocol field and TCP Port 
number, etc. The proposed PGT is memory that is divided into 2^n blocks with each block having 2^m elements 
(where m<n). Each element can be one of three types, branch element, leaf element, or skip element, and within 
each block, there can be any combination of element types. 

While particularly useful for the purpose of the present invention, the Parsing Micro-Engine is generic from 
[...] decisions based on that comparison. This 
can be applied, for example, in any text-searching functions, searching for certain arbitrary words. In such 
applications, as an illustration, words such as "bomb" or "detonate" in a letter or email may be searched and if a 
match is detected, the search engine may then execute predetermined functions such as signaling an alarm. In fact 
the same memory can even be used to search for words in different languages. 

In the context of the invention, Fig. 4 illustrates having two entry points. One entry point is used to search 
for text in one language, while the second entry point is used to search for text in another language. Thus the same 
mechanisms and the same hardware are used for two types of searches. 

There are two components to the datagram header search, the software component and the hardware 
component. The software component creates the elements in the Parse Graph for every new route it finds on an 
interface. The software has to create a unique graph starting from a Branch Element and ending on a Leaf Element, 
later defined, for each additional new route. The hardware walks the graph from Branch to Leaf Element, clueless 
about the IP header. 

In fact there can be many entry points in the memory region, as illustrated in Fig. 21. The initial memory 
can be divided into multiple regions, each region of memory being a separate series of instructions used for different 
applications. In the case of Fig. 22, one of the regions is used for IP forwarding while another is used for 
ATM forwarding. At system start, the memory is initialized to point to "unknown route", meaning that no forwarding 
information is available. When a new entry is inserted, the structure of the Lookup Table changes, as illustrated in 
Fig. 23. The illustrative IP address 209.6.34.224 is shown inserted. Since this is a byte-oriented lookup engine, the 
first block has a pointer inserted in the 209 location. The pointer points to a block that has a new pointer value in the 
6 location, and so on until all of the 209.6.34.224 address is inserted. All other values still point to unknown route. 
Inserting the address in the IP portion of memory has no impact on the ATM portion of memory. 
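
The byte-oriented insertion just described can be sketched in software as follows. This is a minimal illustration under stated assumptions: blocks hold 256 elements, an element is a ("leaf", FLT offset) or ("branch", block number) pair, FLT offset 0 stands for "unknown route", and each region ("ip", "atm") has its own entry block. The class name, method names and region names are hypothetical.

```python
UNKNOWN_ROUTE = ("leaf", 0)  # assumed encoding of the "unknown route" default


class ParseGraph:
    def __init__(self, regions):
        self.blocks = []
        # One entry block per region, e.g. one for IP and one for ATM.
        self.entry = {name: self._alloc() for name in regions}

    def _alloc(self):
        """Allocate a block whose elements all point to unknown route."""
        self.blocks.append([UNKNOWN_ROUTE] * 256)
        return len(self.blocks) - 1

    def insert(self, region, key: bytes, flt_offset: int):
        """Insert a key one byte per block, ending in a leaf."""
        block = self.entry[region]
        for byte in key[:-1]:
            kind, value = self.blocks[block][byte]
            if kind != "branch":
                value = self._alloc()
                self.blocks[block][byte] = ("branch", value)
            block = value
        self.blocks[block][key[-1]] = ("leaf", flt_offset)

    def lookup(self, region, key: bytes) -> int:
        """Walk from the region's entry block; return the FLT offset found."""
        block = self.entry[region]
        for byte in key:
            kind, value = self.blocks[block][byte]
            if kind == "leaf":
                return value
            block = value
        return UNKNOWN_ROUTE[1]


if __name__ == "__main__":
    graph = ParseGraph(["ip", "atm"])
    graph.insert("ip", bytes([209, 6, 34, 224]), flt_offset=7)
    print(graph.lookup("ip", bytes([209, 6, 34, 224])))
    print(graph.lookup("atm", bytes([209, 6, 34, 224])))
```

Because the two regions start from different entry blocks and never share a block, an insertion into the IP region leaves every ATM lookup unchanged.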
As mentioned earlier, there are 2^n blocks, each with 2^m elements, in the parse graph tree. The structure of each element 
is as shown in Fig. 17, with each element having the following fields. 

1. Instruction Field: In the current design there are three instructions, resulting in a two-bit instruction field. The 
instruction description is as follows. 

• Branch Element (00). In so far as the Micro Engine is concerned, the branch element essentially points 
the Forwarding Engine to the next block address. Also, within the branch element, the user may set 
various fields in the "Incremental Forwarding Info Field" [...] 
[...] events or the final Forwarding Information [...] 
an IP header, and the branch element was placed at the end of the destination field, then [...] 
update the egress port field of the forwarding info. For ATM switching, the user [...] 
egress port information at the end of parsing the VPI field. 
• Leaf Element (01). This element instructs the end of parsing to the micro engine. [...] 
information accumulated during the search is then forwarded to the next [...]. 
• Skip Element. [...] 
packet header depends on the number of block addresses the micro engine has to look up. Not every 
[...] the micro engine would have to keep hopping on non-significant fields of the incoming 
datagram stream and continue the search. The skip size is described below. 

[...] the protocol, etc. 

3. Incremental Forwarding Info Field: [...] based on the ToS field. Also, [...] for ATM [...] the VPI [...] 
count could be decided based on the VCI. As the parsing is done, therefore, various pieces of the forwarding 
information are collected, and when a leaf node is reached, the resulting forwarding information is passed on to 
the control path. The width of the incremental forwarding information (hereafter referred to as IFI) should be 
equal to the number of mutually exclusive incremental pieces in the forwarding information. 

4. Next Block Address Field: This field is the next block address to look up after the current one. The leaf node 
instruction ignores this field. 

5. Statistics Offset Field: In data switches, keeping flow statistics is as crucial as the switching of data itself. 
Without keeping flow statistics it would be difficult, at best, to manage a switch. Having this statistics offset 
field allows one to update statistics at various points of the parse. On an IP Router, for example, one could 
collect packet counts on various groups of DA, various groups of SA, all ToS, various protocols, etc. In another 
example dealing with an ATM switch, this field could allow the user to count cells on individual VPI or VCI or 
combinations thereof. If the designer wants to maintain 2^s counters, then the size of this field should be s. 

6. FLT Offset Field: This is an offset into the Forwarding Lookup Table, Fig. 18, later discussed in more 
detail. The Forwarding Lookup Table has all the mutually exclusive pieces of information that are required to 
build the final forwarding information packet. 
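
The element fields listed above can be pictured as a simple record. The sketch below is illustrative only: the text gives the codes 00 and 01 for the branch and leaf instructions, while the skip code 10, the field names and the default values are assumptions. The helper shows the sizing rule from the statistics field: 2^s counters need an s-bit offset.

```python
from dataclasses import dataclass
from enum import IntEnum


class Instruction(IntEnum):
    BRANCH = 0b00
    LEAF = 0b01
    SKIP = 0b10  # assumed code; not stated in the text


@dataclass
class Element:
    instruction: Instruction
    skip_size: int = 0     # presumed second field; used by skip elements
    ifi: int = 0           # incremental forwarding information bits
    next_block: int = 0    # ignored by leaf elements
    stats_offset: int = 0  # index of the counter to update
    flt_offset: int = 0    # offset into the Forwarding Lookup Table


def field_width(count: int) -> int:
    """Bits needed to index `count` items, e.g. s bits for 2**s counters."""
    return max(1, (count - 1).bit_length())


if __name__ == "__main__":
    print("instruction field bits:", field_width(len(Instruction)))
    print("statistics offset bits for 1024 counters:", field_width(1024))
    print(Element(Instruction.LEAF, flt_offset=3))
```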

Reference Hardware Design Example 

The following is an example of a hardware reference design for the parser useful with practice of the 
present invention. The reference design parser has storage that contains the packet/cell under scrutiny. This storage 
element for the cell/frame header information is to be two levels in depth. This creates a two-stage pipeline for 
header information into the destination lookup stage of the Ingress Forwarding Engine. This is necessary because the 
Ingress Forwarding Engine will not be able to perform a lookup until the entire header information has been stored, 
due to the flexible starting point capability. The two-stage pipeline allows the Ingress Forwarding Engine to perform 
a lookup on the present header information and also store the next header information in parallel. When the present 
header lookup is completed, then the next header lookup can proceed immediately. 

The storage element stores a programmable amount of the incoming bit stream. As an example, the 
configuration may be 64 bytes for IP datagrams and 5 bytes for cells. For an interface that handles both cells and 
frames, the maximum of these two values may be used. 

A DMA Transfer Done signal from each DMA channel will indicate to a state machine that it can begin 
snooping and storing header information from the Ingress DMA bus. A packet/cell signal will indicate that the 
header to be stored is either a packet header or a cell header. When header information has been completely stored 
from a DMA channel, a request lookup will be asserted. 

For header lookups, there will be a register-based table which will indicate to the Ingress Forwarding 
Engine the lookup starting point in the header. The Ingress Forwarding Engine [...] 
number to index this table. This information allows the Ingress Forwarding Engine to start [...] in 
the IP header or fields contained in the data portion of the packet. This capability, along with the skip [...] 
[...], will allow the Ingress Forwarding Engine to [...] 
filtering cases per interface. 

A suitable hardware lookup is shown in Fig. 19 using a Parse Tree Graph [...] 
forwarding decision. This algorithm [...] either a nibble or a byte at a time of [...] an IP [...] 
VPI/VCI header. This capability is programmable by software. Each lookup [...] unique [...] 
is pointed to by one of sixteen originating nodes, one per interface. The originating nodes [...] 
programmable register-based table, allowing software to build these [...] 
[...] a branch node. If it is an end node, then the lookup [...] 
clock enables and mux controls to activate. 

The result of the Parser lookup is the Forwarding Table lookup, which is a bank of memory yielding the [...] 
Forwarding ID field will be used in several ways. [...] First, the MSB (Most Significant Byte) of the field is used to 
indicate a unicast or multicast packet at the network interface level. For multicast packets, for example, the Egress 
Queue Manager will need to look at [...] for queuing of multicast packets to multiple interfaces. For unicast 
packets, for example, six bits of the Forwarding ID can indicate the destination interface number [...] 
16 bits will provide a Layer 2 ID. The Layer 2 ID will be used by the Egress Forwarding logic to determine what 
Layer 2 header needs to be prepended to the packet data. For packets, these headers will be added to the packet as it 
is moved from the Egress DMA FIFO (first in, first out) to the Egress Buffer Memory. For cells, the Layer 2 ID will 
provide the transmit device with the appropriate Channel ID. 

For unicast traffic, the Destination I/F number indicates the network destination interface and the Layer 2 
ID indicates what type of Layer 2 header needs to be added onto the packet data. For multicast, the multicast ID 
indicates both the type of Layer 2 header addition and which network interfaces can transmit the multicast. The 
Egress Queue Manager will perform a Multicast ID table lookup to determine on which interfaces the packet will get 
transmitted and what kind of Layer 2 header is put back on the packet data. 

An Example of Life of a Packet Under the Forwarding Engine 

It is now in order to explain examples of a simple and a complex packet through the Forwarding Engine of 
the invention. On power up, Fig. 19, all 2^n blocks of the parse graph [...] 
offset [...] will eventually forward all packets to the Control Processor on the Network card [...] a default route 
of all unrecognized packets. Software is responsible for setting up the default route. The way in which the various 
elements are updated into this parse graph memory will be [...] 
packet [...] with mask 255.255.255.0 and a complex filter packet, [...] the 
simple IP Packet. 
Simple Multicast Packet 

On power up, the entire blocks in the Parse Graph Memory may be assumed to be filled with leaf elements 
that point to the 1st offset of the FLT, which will route the packet to the Network Processor. Let it now be assumed for this 
example that the ingress packet has a destination IP Address of 224.5.6.7. In this case, the hardware will look up the 
224th offset in the 1st block (the first lookup block is also [...] node) [...] 
end the search and look up the default offset in the 224th location and look up the FLT and forward the packet to the 
control processor. 

When the control processor forwards subsequent packets of Destination IP address 224.5.6.7, it will 
generate the graph shown in Fig. 21. 

The software first has to create the parse graph locally. The parse graph created is listed as 1-129-2-131. 
The software always looks up the first block, a.k.a. the originating node. The offset in the first block is 224, which is the 
first byte of the destination IP header. It finds a default route, an indication for software to allocate a new block for 
all subsequent bytes of the destination IP address. Once the software hits a default route, it knows that this is a link 
node. From the link node onwards, the software has to allocate new blocks for every byte it wants the hardware to 
search for a matched destination IP address. Through an appropriate software algorithm, it finds that 129, 2 and 131 are 
the next three available blocks to use. The software will then install a continuation element with BA of 2 in the 5th 
offset of block 129, a continuation element with BA of 131 in the 6th offset of block 2, and a leaf element of FLT offset 3 
at the 7th offset of block 131. Once such a branch with a leaf is created, the node link is then installed. The node has to 
be installed last in the new leafed branch. The node in this case is a continuation element with BA of 129 at offset 
224 of the 1st block. 

The hardware is now ready for any subsequent packets with destination IP address 224.5.6.7, even though it
knows nothing about it. Now, when the hardware sees the 224 of the destination IP address, it goes to the 224th
offset of the 1st block of the parse graph and finds a continuation element with BA of 129. The hardware will then
go to the 5th offset (second byte of destination IP address) of the 129th block and find another continuation
element with BA of 2. The hardware will then go to the 6th offset (third byte of destination IP address) of the
2nd block and find another continuation element with BA of 131. The hardware will then go to the 7th offset
(fourth byte of destination IP address) of the 131st block and find a leaf element with FLT of 3. The hardware now
knows that it has completed the IP match and will forward the forwarding ID in location 2 to the subsequent
hardware block, calling the end of packet parsing.

It should be noted that the hardware is simply a slave of the parse graph put in memory by software. The
length of the search purely depends on the software requirement of parsing length and memory size. The adverse
effects of such parsing are size of memory, and search time, which is directly proportional to the length of the
search.

In this case, the search will result in the hardware effecting 4 lookups in the Parse Graph and 1 lookup in the FLT.
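The install-then-link sequence and the hardware walk described above can be sketched in a few lines of Python. This is only an illustrative model, not the patented implementation: the class and method names (`ParseGraph`, `install`, `lookup`), the 256-entry block size, the default FLT offset of 1, and the free-block list standing in for the "appropriate software algorithm" are all assumptions. The leaf's FLT offset of 3 is just an illustrative value.

```python
from dataclasses import dataclass

BLOCK_SIZE = 256      # assumed: one entry per possible byte value
DEFAULT_FLT = 1       # assumed FLT offset that routes to the control processor


@dataclass(frozen=True)
class Leaf:
    flt: int          # offset into the forwarding lookup table (FLT)


@dataclass(frozen=True)
class Continuation:
    ba: int           # block address (BA) of the next block to search


def new_block():
    """A fresh block in which every offset holds the default leaf."""
    return [Leaf(DEFAULT_FLT)] * BLOCK_SIZE


class ParseGraph:
    def __init__(self, free_blocks):
        self.blocks = {1: new_block()}    # block 1 is the originating node
        self.free = list(free_blocks)     # hypothetical block allocator state

    def install(self, key, flt):
        """Install a leaf for the byte sequence `key`; the link node is written last."""
        ba, i = 1, 0
        # Follow existing continuation elements until the link node is reached.
        while i < len(key) - 1 and isinstance(self.blocks[ba][key[i]], Continuation):
            ba, i = self.blocks[ba][key[i]].ba, i + 1
        if i == len(key) - 1:             # no new blocks needed, just write the leaf
            self.blocks[ba][key[i]] = Leaf(flt)
            return
        # Allocate one new block per remaining byte and build the branch.
        chain = [self.free.pop(0) for _ in key[i + 1:]]
        for b in chain:
            self.blocks[b] = new_block()
        for j, b in enumerate(chain):
            last = j == len(chain) - 1
            self.blocks[b][key[i + 1 + j]] = Leaf(flt) if last else Continuation(chain[j + 1])
        # Only now make the branch reachable, so a lookup never sees a half-built branch.
        self.blocks[ba][key[i]] = Continuation(chain[0])

    def lookup(self, key):
        """Return (FLT offset, number of parse graph lookups) for `key`."""
        ba, lookups = 1, 0
        for byte in key:
            lookups += 1
            element = self.blocks[ba][byte]
            if isinstance(element, Leaf):
                return element.flt, lookups
            ba = element.ba
        raise ValueError("key exhausted before a leaf was reached")
```

With free blocks 129, 2 and 131, installing 224.5.6.7 produces the chain 1, 129, 2, 131, and a subsequent lookup reaches the leaf after four parse graph lookups; before the install, the same address stops at the default leaf after one lookup.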

Packet with Mask 255.255.255.0

Building upon the parse graph in Fig. 20, a packet with an illustrative mask 255.255.255.0 and address of
4.6.7.x is now installed. In this case, the software will go to the 4th offset in the originating node and find a
continuation element with BA of 129. The software will then go to offset 6 in block 129 and find a default FLT
offset. The software then knows that this is a link node. From now on, it has to allocate more blocks in the parse
graph, such as block 2. At offset 7 of block 2, it will install a leaf element with FLT 3. Then it will install the
link node, consisting of writing a continuation element with BA of 2 at offset 6 of block 129.

When the hardware receives any packet with the header 4.6.7.x, it will look into the 4th offset of the
originating node and find a continuation element with BA of 129, then look at the 6th offset in block 129 and find
a continuation element with BA of 2, and then look at the leaf element at offset 7 of block 2 with FLT of 3. This
FLT value of 3 is then forwarded to the Buffer Manager and eventually the Egress Bandwidth Manager.
Packet with Mask 255.255.0.0 

This subsection will build upon the parse graph in Fig. 20 and install a packet with an illustrative mask
255.255.0.0 and address of 4.8.x.y. In this case, the software will go to the 4th offset in the originating node and
find a continuation element with BA of 129. The software will then go to offset 8 in block 129 and find a default
FLT offset. At this time the software knows that it has to install a new FLT offset (say 4) in the 8th offset of
block 129.

When the hardware receives any packet with the header 4.8.x.y, it will look into the 4th offset of the
originating node and find a continuation element with BA of 129, then look at the leaf element at offset 8 of block
129 with FLT of 4, and terminate the search. In this case the hardware will do only 2 lookups.
Complex Filtered Packet 

Now assume that there was a requirement to filter a packet with header 4.5.6.8.9.x.y.z.11. There are no
restrictions to the above concept of parsing the packet, although the time it takes to parse the packet will
increase, since the hardware will have to read and compare 9 bytes. The hardware will simply keep parsing,
however, until it sees a leaf element. The x.y.z bytes are blocks which contain continuation elements pointing to
the next block, with all continuation elements of x pointing to block y, all continuation elements of y pointing to
block z, and all continuation elements of z pointing to the block which has entry 11 as a leaf, the rest being
default. This is where the fork element comes into play and may be called up to look up the forwarding at the end
of search 4.5.6.8.
Removing Simple IP Multicast Packets

The removal of packets is similar to the reverse of adding an address to the parse graph, above explained. The
pseudocode for removal in this embodiment is as follows:

    Walk down to the end of the leaf, remembering each block address and offset in block.

    FOR (from leaf node to originating node)
        IF (only element in block)
            set default FLT offset at the previous NODE offset address
            free the last block
            go to previous block
        ELSE
            set default FLT offset at last leaf
            exit
        ENDIF
    END FOR
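The removal walk can be sketched as follows. This is a minimal illustration under stated assumptions, not the embodiment's code: blocks are modelled as sparse dictionaries in which an absent offset stands for the default element, and the names `remove`, `blocks`, `free` and `DEFAULT` are hypothetical.

```python
DEFAULT = ("leaf", 1)   # assumed default element: a leaf pointing at a default FLT offset


def remove(blocks, free, key, root=1):
    """Remove the branch for `key`, freeing every block that becomes empty.

    `blocks` maps block address -> {offset: element}. An element is either
    ("cont", block_address) or ("leaf", flt_offset); offsets that are absent
    from a block hold the default element.
    """
    # Walk down to the leaf, remembering each (block address, offset) visited.
    path, ba = [], root
    for byte in key:
        path.append((ba, byte))
        kind, value = blocks[ba].get(byte, DEFAULT)
        if kind == "leaf":
            break
        ba = value
    # Walk back from the leaf towards the originating node.
    for ba, off in reversed(path):
        blocks[ba].pop(off, None)      # restore the default element at this offset
        if blocks[ba] or ba == root:
            break                      # block still holds other entries (or is the root)
        del blocks[ba]                 # it was the only element: free the block
        free.append(ba)
```

Removing 224.5.6.7 from the chain 1, 129, 2, 131 frees blocks 131, 2 and 129 in that order and resets offset 224 of the originating node to the default; if block 2 held another entry, only block 131 would be freed.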

Egress Bandwidth Manager 

Every I/O Module connects a NeoN Port to one or multiple physical ports. Each I/O Module supports
multiple traffic priorities injected via a single physical NeoN Port. Each traffic priority is assigned some
bandwidth by a network manager, as illustrated in Fig. U, being labeled as the "QoS (Packet & Cell)". The purpose
of this section is to define how bandwidth is managed on multiple traffic profiles.
NeoN Queuing Concepts

The goal of NeoN Queuing of the invention, thus, is to be able to associate a fixed configurable bandwidth
with every priority queue and also to ensure maximum line utilization. Traditionally, bandwidth enforcement is done
in systems by allocating a fixed number of buffers per priority queue. This means that the enqueuing of data on the
priority queues enforces bandwidth allocation. When the buffers of a certain queue are full, the data is
dropped (by not enqueuing data on that queue), this being a rough approximation of the ideal requirement.

There are many real life analogies to understanding the concept of QoS of the present invention, e.g. cars on
a highway with multiple entry ramps, or moving objects on a multi-channeled conveyor in a manufacturing operation.
For our purposes, let us examine the simple case of "cars on a highway". Assume that 8 ramps were to merge into
one lane at some point on the highway. In real life experience, such a merge could produce a traffic jam.
But if managed correctly (i.e. with the right QoS), then the single highway lane can be utilized for maximum
efficiency. One way to manage this flow is to have no control, and have all be serviced on a first come, first
serviced method. This means that there is no distinction between an ambulance on one ramp and someone headed to
the beach on another ramp. But in the methodology of the invention, we define certain preferential characteristics
for certain entry ramps. There are different mechanisms that we can create. One is to send one car from each entry
ramp in a round robin fashion, i.e. each ramp is equal. This means counting cars. But if one of these "cars" turns
out to be a tractor trailer with 3 trailers, then in fact equal service is not being given to all entry ramps as
measured by the amount of highway occupied. In fact, if one entry ramp is all tractor trailers, then the backup on
the other ramps could be very significant. So it is important to measure the size of the vehicle and its
importance. The purpose of the "traffic cop" (aka QoS manager) is to manage which vehicle has the right of way,
based on size, importance and perhaps lane number. The "traffic cop" can, in fact, have different instructions
every other day on the lane entry characteristics, based on what the "town hall manager" aka network manager has
decided. To conclude the concept of QoS understanding, QoS is a mechanism which allows certain datagrams to pass
through queues in a controlled manner, so as to achieve a deterministic and desired goal, which may vary from
application to application, e.g. bandwidth utilization, precision bandwidth allocation, low latency, low delay,
priority etc.

The NeoN Queuing of the invention handles the problem directly. NeoN Queuing views the buffer
allocation as an orthogonal parameter to the queuing and bandwidth issue. NeoN Queuing will literally segment the
physical wire into small time units called "Time Slice" (as an example, approximately 200 nanoseconds on OC48,
the time of a 64 byte packet on an OC48). Packets from the back-plane are put into the Priority Queues. Each time a
packet is extracted from a queue, a timestamp is also tacked along with that queue. The timestamp indicates the
distance in time from a 'Current Time Counter' in Time Slice units, and when the next packet should be dequeued.
The 'distance in time' is a function of a) packet size information coming in from the back-plane, b) the size of
the Time Slice itself and c) the bandwidth allocated for the priority queue. Once a packet is dequeued, another
counter is updated which represents the Next Time To Dequeue (NTTD), such being purely a function of the size of
the packet just dequeued. NTTD is one for a cell-based card, because all packets are the same size and fit in one
buffer. This really proves that the NeoN Egress Bandwidth Manager is monitoring the line to determine exactly what
next to send. This mechanism, therefore, is a bandwidth manager rather than just a dequeuing engine.

The NeoN Queuing of the present invention, moreover, may be thought of as a TDM scheme for allocating
bandwidth for different priorities, with priority queuing for ABR bandwidth. The added advantage of the NeoN
Queuing is that, within the TDM mechanism, bandwidth is calculated not on packet count but on packet byte size.
This granularity is a much closer replication of actual bandwidth utilization and allows true bandwidth
calculations rather than simulated approximations. The second "NeoN Advantage" is that the Network Manager can
dynamically change the bandwidth requirements. This is feasible since the bandwidth calculations for Priority
Queues are not at all based on buffer allocations. In NeoN Queuing, rather, the bandwidth allocation is based on
time slicing the bandwidth on the physical wire. This type of bandwidth management is absolutely necessary when
running very high line speeds, to keep the line utilization high.

Mathematics Used during Queuing 

First we will develop the variables and constants being used in the ultimate mathematics.



Symbols   | Description
TS        | Time Slice of bandwidth on wire used for calculations (200 nSec for OC48).
NTTS      | Next Time To Send; a number in units of TS representing when, relative to current time, to de-queue an address.
BitTime   | Time period of a single bit on the wire of the current I/O module.
Δn        | Delay factor in number of TS, representing bandwidth calculations set by Network Manager, for priority Queue n.
BWn       | Bandwidth of Queue n in percentage, as entered or calculated by the CPU software.
Pn        | Number of Priority Queues.
TBW       | Total Bandwidth of the wire.
NTTD      | Next Time To Dequeue.
CT        | Current Time in TS units.

Consider first the user interface level, to see how bandwidth is allocated amongst various priorities. The user
is normally given the job of dividing 100% bandwidth amongst various priorities. The user could also be presented
with breaking up the entire bandwidth in bits per second (as an example, for OC48 it would be 2.4 Gbits). In either
case, some CPU software calculates a number pair, priority-Δn, from %-priority or Mbits/sec-priority. Since the CPU
is doing this calculation, it can be easily changed based on the I/O module. The Bandwidth Manager does not need to
know about the I/O module type, only caring about the priority-Δn pair. Thus if a user connected to the NeoN port
cannot handle data at full line rate, the CPU can change this value to adjust for the customer requirements.

$$\Delta_n = 100 / BW_n \qquad (1)$$



NTTS is calculated for a queue each time a packet is de-queued from that queue:

$$NTTS_n = \left(\frac{\text{Packet Byte Count} \cdot \text{BitTime}}{TS} \cdot \Delta_n\right) + NTTS_{n-1} \qquad (2)$$

■00- of a TS lime, as lime a^, ^^^^S^S^ST' " 

Next Time To De-queue is the time at which the next de-queue may start. This is primarily based on the
current time and the size of the packet just de-queued:

$$NTTD_n = \left((\text{Packet Byte Count} \cdot \text{BitTime}) \bmod TS\right) + CT \qquad (3)$$
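The three equations can be exercised with a small sketch. Several points here are assumptions rather than statements from the text: the byte count is multiplied by 8 to obtain bits, the packet's wire time is rounded up to whole time slices, and Equation (3) is read as "current time plus the number of slices the packet occupies". These readings were chosen because they reproduce the NTTS values of 10 and 20 and the NTTD values of 1 and 2 quoted for 45 and 90 byte datagrams in the CBR example later in the text. The function names are hypothetical.

```python
def delay_factor(bw_percent):
    """Equation (1): delay factor of a queue from its bandwidth share in percent."""
    return 100.0 / bw_percent


def time_slices(packet_bytes, bit_time_ps, ts_ps):
    """Whole time slices a packet occupies on the wire.

    Assumptions: bytes are converted to bits (x 8) and the result is rounded up.
    Integer picoseconds are used to avoid floating point rounding.
    """
    wire_time_ps = packet_bytes * 8 * bit_time_ps
    return -(-wire_time_ps // ts_ps)          # ceiling division


def next_ntts(prev_ntts, packet_bytes, bit_time_ps, ts_ps, delta):
    """Equation (2): advance the queue's Next Time To Send after a de-queue."""
    return prev_ntts + time_slices(packet_bytes, bit_time_ps, ts_ps) * delta


def next_nttd(ct, packet_bytes, bit_time_ps, ts_ps):
    """Equation (3), as read above: when the wire is free for the next de-queue."""
    return ct + time_slices(packet_bytes, bit_time_ps, ts_ps)
```

For an OC48 bit time of 402 ps, a 200 ns time slice and a queue holding 10% of the bandwidth, `delay_factor(10)` gives 10, so each 45 byte datagram pushes NTTS out by 10 slices while NTTD advances by only 1.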

Queuing Processing

It is now in order to describe the processing needed to queue addresses from the back-plane on to the Priority
Queues; Fig. 24 depicts the overall queuing and scheduling process. Control Data, which includes datagram
addresses, from the "NeoN Control Data Switch", is sorted into priority queues by the Queue Engine, based on the
QoS information embedded in the Control Data. The Scheduling Engine, which schedules datagram addresses through
use of the novel algorithms of the invention listed further below, operates independently of the Queue Engine.

The Queuing Engine has the following tasks:

    Enqueue Data: place buffer addresses read from the Input Fifo on the appropriate Priority Queue.

    Watermarks: set back-pressure on the ingress based on the watermarks set for the queue.

    Drop Packets: start dropping packets when the Priority Queues are full.

For each Priority Queue Pn, there will be a 'head pointer' (pHeadn) and a 'tail pointer' (pTailn). The Input
Fifo feeds the Priority Queues with buffer addresses from the back-plane. Additionally, there is a forward. For OC48
rates, and assuming 64 byte packets as average size packets, the following processing will be done in about
200 nSecs. The preferred pseudocode of the invention for the Enqueue Processor is as follows:

    Read Input Fifo.
    Find priority of the packet
    IF (room on queue)
        move buffer from Input Fifo to *pTailn priority queue.
        Advance pTailn.
        update statistics
        increment buffer count on queue
        IF (packet count on queue >= watermark of that queue)
            set back-pressure for that priority
            update statistics
        ENDIF
    ELSE
        move buffer from Input Fifo to drop queue.
        update statistics
    ENDIF

The verbal explanation of the pseudocode listed above is as follows. As each control packet is read from the
'NeoN Control Data Switch', it is put onto one of N queues after it is verified that physical space is available
on the queue. If there is no room on the queue, the data is put on a drop queue, which allows the hardware to
return addresses back to the originating port via the 'NeoN Control Data Switch'. Also a watermark is set, per
queue, to indicate to the ingress to filter out non-preferred traffic. This algorithm is simple but needs to be
executed in one TS.
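The enqueue steps can be modelled in software as below. This is a sketch only: the hardware head and tail pointers are replaced by Python deques, and the class name, the single shared `capacity` and `watermark` values and the statistics keys are illustrative assumptions.

```python
from collections import deque


class QueueEngine:
    def __init__(self, n_queues, capacity, watermark):
        self.queues = [deque() for _ in range(n_queues)]   # index 0 = highest priority
        self.drop_queue = deque()
        self.capacity = capacity          # buffers each priority queue can hold
        self.watermark = watermark        # back-pressure threshold per queue
        self.back_pressure = [False] * n_queues
        self.stats = {"enqueued": 0, "dropped": 0, "back_pressure": 0}

    def enqueue(self, priority, buffer_addr):
        """Process one entry read from the Input Fifo (one call per time slice)."""
        q = self.queues[priority]
        if len(q) < self.capacity:                 # room on queue
            q.append(buffer_addr)                  # store at the tail, advance the tail
            self.stats["enqueued"] += 1
            if len(q) >= self.watermark:           # queue is filling up
                self.back_pressure[priority] = True
                self.stats["back_pressure"] += 1
            return True
        self.drop_queue.append(buffer_addr)        # address is returned to the ingress
        self.stats["dropped"] += 1
        return False
```

With a capacity of 3 and a watermark of 2, the second buffer on a queue raises back-pressure for that priority and the fourth buffer lands on the drop queue.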
Scheduling Processing 

This section will list the algorithm used to de-queue addresses from Priority Queues Pn onto the output fifo.

This calculation also has to be done during one TS.

    Wait here till CT == NTTD AND no back pressure from output fifo.    // sync up
    X = FALSE                                                           // some variable
    FOR (all Pn, High to Low)
        IF (pHeadn != pTailn)
            IF (CT >= NTTSn)
                De-Queue (pHeadn)
                Calculate new NTTSn        // see equation (2) above
                Calculate NTTD             // see equation (3) above
                update statistics
                X = TRUE
                EXIT FOR
            ENDIF
        ENDIF
    ENDFOR
    IF (X == FALSE)
        FOR (all Pn, High to Low)
            IF (pHeadn != pTailn)
                De-Queue (pHeadn)
                Calculate new NTTSn        // see equation (2) above
                Calculate NTTD             // see equation (3) above
                update statistics
                X = TRUE
                EXIT FOR
            ENDIF
        ENDFOR
    ENDIF
    IF (X == FALSE)
        update statistics
    ENDIF
    Update CT

The function De-Queue is conceptually a simple routine, listed below:

    De-Queue(Qn)
        *pOutputQTail++ = *pHeadn++

The explanation of the pseudocode listed above is that there are two FOR loops in the algorithm: the first
FOR loop enforcing the committed bandwidth to the queue, and the second FOR loop serving for bandwidth
utilization, sometimes called the aggregate bandwidth FOR Loop.

Examining first the Committed FOR Loop, the queues are checked from the Highest Priority Queue to the
Lowest Priority Queue for available datagrams to schedule. If a queue has an available datagram, the algorithm will
check to see if the queue's time has come to dequeue, by comparing its NTTSn against CT. If the NTTSn has fallen
behind CT, then the queue is dequeued; otherwise, the search goes on for the next queue until all queues are
checked. If data from a queue is scheduled to go out, a new NTTSn is calculated for that queue, and a new NTTD is
always calculated when any queue is dequeued. When a Network Manager assigns weights for the queues, the sum of all
weights should not exceed 100%. Since NTTSn is based on datagram size, the output data per queue is a very accurate
implementation of the bandwidth set by the manager.

Let us now examine the Aggregate FOR Loop. This loop is only executed when no queue is de-queued
during the Committed FOR Loop. In other words, only one de-queue operation is performed in one TS. In this FOR
Loop, the queues are again checked from the highest to the lowest priority. The algorithm arrives in this FOR Loop
for one of two reasons: either there was no data in any of the queues, or the NTTSn of all queues was still ahead
of CT. If the loop was entered because NTTSn was not reached for all queues, then the aggregate loop will find the
highest priority such queue and de-queue it; also in that case it would update NTTSn and calculate NTTD.
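The two loops can be put together in a compact software model. This is a sketch under assumptions, not the hardware design: queues hold datagram sizes in bytes, output back-pressure is ignored, the wait condition is relaxed from `CT == NTTD` to `CT >= NTTD`, and a packet's wire time is rounded up to whole time slices. The `Scheduler` class and its attribute names are hypothetical.

```python
from collections import deque


class Scheduler:
    def __init__(self, deltas, bit_time_ps=402, ts_ps=200_000):
        self.queues = [deque() for _ in deltas]   # index 0 = highest priority
        self.deltas = list(deltas)                # delay factor per queue, Equation (1)
        self.ntts = [0] * len(self.deltas)        # Next Time To Send per queue, in TS
        self.nttd = 0                             # Next Time To Dequeue, in TS
        self.ct = 0                               # Current Time, in TS
        self.bit_time_ps, self.ts_ps = bit_time_ps, ts_ps
        self.output = []                          # (time, queue index, size) records

    def _slices(self, size_bytes):
        # Whole time slices occupied on the wire, rounded up (an assumption).
        return -(-size_bytes * 8 * self.bit_time_ps // self.ts_ps)

    def _dequeue(self, n):
        size = self.queues[n].popleft()
        self.output.append((self.ct, n, size))
        self.ntts[n] += self._slices(size) * self.deltas[n]   # Equation (2)
        self.nttd = self.ct + self._slices(size)              # Equation (3)

    def tick(self):
        """Run one time slice; at most one datagram is de-queued."""
        if self.ct >= self.nttd:                  # the wire is free again
            ready = [n for n, q in enumerate(self.queues) if q]
            # Committed loop: highest priority queue whose NTTS has been reached.
            due = [n for n in ready if self.ct >= self.ntts[n]]
            if due:
                self._dequeue(due[0])
            elif ready:
                # Aggregate loop: nothing was due, so the highest priority
                # non-empty queue is served ahead of its time.
                self._dequeue(ready[0])
        self.ct += 1
```

With delay factors of 10 and 2 and 45 byte datagrams waiting on both queues, the first queue sends at time 0, the second at times 1 and 2, and at time 3, when neither queue is due, the aggregate loop serves the first queue early.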

The algorithm has built-in credits for queues that do not have data to dequeue in their time slot, and debits
for data that is de-queued in the Aggregate Loop. These credits and debits can accumulate over large periods of
time. The debit and credit accumulation time is a direct function of the size of the NTTS field in bits; for
example, a 32 bit number would yield about 6 minutes in each direction using 160 nSec as TS (2^32 × 160 nSec).
Each individual queue could be configured to lose credits and/or debits, depending on the application in which
this algorithm is used. For example, if the algorithm was to be used mainly for CBR type circuits, one would want
to clear the debits fairly quickly, whereas for bursty traffic they could be cleared rather slowly. The mechanism
for clearing debits/credits is very simple: asynchronously setting NTTS to CT. If NTTS is way ahead of CT, the
queue has built a lot of debit, and setting NTTS to CT would mean losing all the debit. Similarly, if NTTS had
fallen behind CT, the queue has built a lot of credit, and setting NTTS to CT would mean losing all the credit.
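The "6 minutes in each direction" figure can be checked with a few lines of arithmetic. The split of the counter range into one half ahead of CT (debit) and one half behind CT (credit) is an assumption about how the field is interpreted; the constant names are illustrative.

```python
TS_NS = 160            # time slice used in the text's example, in nanoseconds
FIELD_BITS = 32        # width of the NTTS field

# Full range of the counter expressed in seconds.
total_seconds = (2 ** FIELD_BITS) * TS_NS * 1e-9

# Assumed: half of the range lies ahead of CT (debit), half behind it (credit).
per_direction_minutes = total_seconds / 2 / 60
```

The full range comes to roughly 687 seconds, i.e. a little under 6 minutes in each direction.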

Example of Implementing CBR Queue Using the Algorithm 

It is now appropriate to examine how to build a CBR queue out of the algorithm listed above, again
referencing Fig. 24. Let it be assumed that the output wire is running at OC48 speeds (2.4 Gbits per second) and
that Queue 1 (the highest Priority Queue) has been assigned to be the CBR Queue. The weight on the CBR queue is
configured by summing all the input CBR flow bandwidth requirements. For the sake of simplicity, assume there are
100 flows going through the CBR Queue, each with a bandwidth requirement of 2.4 Mbits per second. The CBR Queue
bandwidth will then be 2.4 Mbits/sec times 100, i.e. 240 Mbits per second (i.e. 10%). In other words,

$$QRATE_{cbr} = \sum \text{Ingress Flow Bandwidth}$$

    Δn = 100/10 = 10, based on Equation 1.
    NTTS would result in 10 every time a 45 byte datagram is dequeued, based on Equation 2.
    NTTS would result in 20 every time a 90 byte datagram is dequeued, based on Equation 2.
    NTTD would result in 1 every time a 45 byte datagram is dequeued, based on Equation 3.
    NTTD would result in 2 every time a 90 byte datagram is dequeued, based on Equation 3.

This shows that the queue will be dequeued very timely, based on datagram size and the % of bandwidth
allocated to the queue. This algorithm is independent of wire speed, making it very scalable, and can achieve very
high data speeds. This algorithm also takes datagram size into account during scheduling, regardless of the
datagram being a cell or a packet. So long as the Network Manager sets the weight of the queue as the sum of all
ingress CBR flow bandwidth, the algorithm provides the scheduling very accurately.

Example of Implementing UBR Queue Using the Algorithm. 

It is very simple to implement a UBR queue using this algorithm, UBR standing for the queue
which uses the left over bandwidth on the wire. To implement this type of queue, one of the N queues is configured
with 0% bandwidth, and this queue is then dequeued when there is literally no other queue to de-queue. The NTTS
will be set so far in the future that, after the algorithm dequeues one datagram, the next one is never scheduled.
QoS Conclusion 

As has been demonstrated, the algorithm of the invention is very precise in delivering bandwidth, and its
granularity is based on the size of TS, being independent of Cell/Packet information. It also provides all of the
ATM services required, implying not only that packets also enjoy the ATM services but that cells and packets
coexist on the same interface.

Real Life Network Manager Examples 

This section will now consider different Network Management bandwidth management scenarios, all well
handled by the invention. Insofar as the NeoN Network controller is concerned, there are n egress queues (as an
example, it could be 8), each queue being assigned a bandwidth. The Egress Bandwidth Manager will deliver that
percentage very precisely. The Network Manager can also decide not to assign 100% of the bandwidth to all queues,
in which case the left over bandwidth will simply be distributed on a high to low priority basis. Besides these two
levels of control, the Network Manager can also examine statistics per priority and make strategic statistical
decisions on its own and change percentage allocations.
Exemplary Case 1 : Fixed Bandwidth 

In this scenario, 100% of the bandwidth is divided into all queues. If all queues are full at all times, then
the queues will behave exactly like Weighted Fair Queuing. The reason for this is that the Egress Bandwidth Manager
will deliver the percentage of the line bandwidth as requested by the Network Manager, and since the queues are
never empty, the Egress Bandwidth Manager does not have time to execute the second FOR loop (Aggregate Loop),
above discussed. If a queue is empty during its time slot, however, another queue may be serviced ahead of its time
without a charge against its bandwidth.

As an example, if the Network Manager decided to allocate 12.5% bandwidth to every one of the eight
queues, then the Network Manager has to provide to the Egress Bandwidth Manager:

    a Priority List of all Δn, one for each priority;
    the Bit Time, based on the I/O Module the Egress Bandwidth Manager is running on.

For a bandwidth of 12.5%, Δn would calculate to be 8.00 (100/12.5). For an OC48, Bit Time would calculate
to be 402 psec.
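The two figures in this example can be reproduced directly. The OC-48 line rate of 2488.32 Mbit/s is the standard SONET figure rather than a value quoted in this text, and the variable names are illustrative.

```python
OC48_LINE_RATE_BPS = 2_488_320_000          # standard SONET OC-48 line rate

n_queues = 8
bw_percent = 100 / n_queues                 # 12.5 % for each of the eight queues
delta = 100 / bw_percent                    # Equation (1): delay factor per queue
bit_time_ps = 1e12 / OC48_LINE_RATE_BPS     # duration of one bit in picoseconds
```

Here `delta` evaluates to 8.0 and `bit_time_ps` to roughly 401.9, which rounds to the 402 psec quoted above.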

Exemplary Case 2: Mixed Bandwidth 

In this example, not all of the bandwidth is divided into all of the queues; the sum of the bandwidth
allocated on the queues is not 100% of the available egress bandwidth. The remainder is distributed on a high to
low priority basis as aggregate bandwidth.

Exemplary Case 3: No Mixed Bandwidth For All Queues 

In this scenario, no bandwidth is allocated as fixed bandwidth for any of the queues. The queues then behave
like prioritized queuing. The first FOR Loop listed in the Scheduling Processing section will be considered as a NOP.
Exemplary Case 4: Dynamic Bandwidth

In this scenario, the Network Manager may initially come up with No Mixed Bandwidth for all Queues
and then, as it starts to build committed bandwidth circuits, may create fixed bandwidth queues. The granularity
of the allocatable egress bandwidth is largely dependent on the depth of the percentage value. As an example, it
may be assumed that the granularity is 0.01 percent, which would calculate to be 240 kHz for an OC48 line and
62 kHz for an OC12 line. The invention is not limited to these examples.

Further modifications will occur to those skilled in the art, and such are considered to fall within
the spirit and scope of the invention as defined in the appended claims.

1. A method [...]

[...] delay variation.

[...] the common algorithm, cell data and data packets.

[...] space exists on the queue.

[...] returned by the switch to the ingress of the network.

[...] non-preferred data traffic.

[...] based upon time slicing the bandwidth.

10. A method as claimed in claim 9 wherein the network manager dynamically varies the bandwidth
requirement.

11. A method of processing information contained in data cells and data packets received at the ingress of a
data networking system, that comprises, applying both the received data cells and data packets to a common
data forwarding and routing switch; managing both cell and packet data switching in the common switch using
common hardware, common quality of service algorithms, and common forwarding algorithms; and
controlling the packet switching independently of and without interfering with the cell data switching.

12. A method of processing packets of information from a forwarding switch and queue managing the
forwarding of the same, that comprises, as each packet is read from the switch, putting the same into one of a
plurality of queues after it is verified that available physical space exists in the queue; placing the packet
information in a drop queue should there be no such space and returning the packet information through the
switch; setting a watermark for each queue to enable the filtering of non-preferred information traffic; and
allocating for different priorities by packet byte size and based upon time slicing the bandwidth.

13. A system architecture apparatus for simultaneously processing information contained in data cells and
data packets received at the ingress of a data networking system, said apparatus having, in combination, means
for applying both the received data cells and data packets from the ingress to a common data switch within the
system; means for controlling the switch for cell and packet indiscriminately, for forwarding by a common
algorithm based on control information contained in the cell or packet and without transforming packets into
cells; and means for controlling, with a common bandwidth management algorithm, both cell and packet data
forwarding without impacting the correct forwarding characteristics of either.

14. Apparatus as claimed in claim 13 wherein the cell and packet control information is processed in a
common forwarding engine with common algorithms, independent of context-sensitive information contained
in the cell or packet.

15. Apparatus as claimed in claim 14 wherein means is provided for passing the information from the
forwarding engine to a network egress queue manager and thence to a network egress transmit facility, and in a
manner such as to provide minimal cell/packet delay variation.

16. Apparatus as claimed in claim 15 wherein quality of service information is included in the information
passed from the forwarding engine and managed by the queue manager for both cells and packets
simultaneously based upon the common algorithm.

[...] forwarding both cells and data packets.

[...] physical space exists on the queue.

[...] put in a drop queue and returned by the switch to the ingress of the network.

20. Apparatus as claimed in claim 19 wherein a watermark is set for each queue to instruct such ingress to filter
out non-preferred data traffic.

[...] packet byte size and based upon time slicing the bandwidth.

22. Apparatus [...] arbitrary size.

24. Apparatus as claimed in claim 14 wherein, between the ingress and the switch, a VCI function/assembly is
interfaced.

25. Apparatus as claimed in claim 24 wherein said assembly connects not only to the switch but also to a header
lookup and forwarding engine for both the cell and packet data, with the engine connecting through a control data
switch to a quality of service managing module to a buffer inputting [...]

26. Apparatus as claimed in claim 25 wherein the buffer feeds a cell data VC shaping circuit that connects with
the system egress.